So the next thing we did was we extended that to linear classifiers.
So essentially the same method, but a different problem.
Instead of having a single linear model that describes all points, we classify the points
into good and bad ones.
And we're looking for a separating line
between the good and the bad ones.
And instead of finding a linear model which
is given by a linear function being equal to 0,
we're interested in the two half planes
where the linear function is greater than 0 or less than 0,
which is exactly what the linear separator gives us.
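This idea of labeling a point by which half plane it falls into can be sketched in a few lines (a minimal sketch, not the lecture's code; the weight vector and points are made-up, with a leading 1 in each point so the bias is folded into w):

```python
import numpy as np

# Hypothetical weights: [bias, w1, w2]. A point x is classified by the
# sign of the dot product w . x: positive -> one half plane, else the other.
w = np.array([1.0, 2.0, -1.0])
points = np.array([[1.0, 3.0, 2.0],   # leading 1 is the bias feature
                   [1.0, 0.0, 2.0]])

# 1 = "good" half plane (w . x > 0), 0 = "bad" half plane
labels = (points @ w > 0).astype(int)
print(labels)
```

The separating line itself is exactly the set of points where the dot product equals 0.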
And if we're lucky, the good and the bad points
are actually linearly separable, which is not always the case.
It could be that one of the black ones would just
lie up here, then we have non-separable data.
And unless we discount some of the data points,
we don't have a separation,
and the problem becomes unsolvable as stated.
We'll have to see what to do about that.
But right now, we are assuming the data are separable.
So you go through all the motions.
And the key idea here is that we get exactly the same equation.
Here we have it.
But what we're doing now is we are taking this term,
the dot product w · x,
and passing that real value through a threshold function.
And the main problem we're encountering here
is whether we use a hard threshold or a smooth threshold.
And it is perhaps unsurprising that the hard threshold
does not work well here.
If we take a sigmoid function instead, for instance
the standard logistic function, then calculus techniques
work better.
This thing here is discontinuous.
This is smooth, meaning arbitrarily differentiable.
And we can do calculus.
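The two threshold choices can be sketched as follows (a minimal sketch, not the lecture's code): the hard threshold is a discontinuous step, while the logistic function is smooth everywhere.

```python
import math

def hard_threshold(z):
    # Discontinuous step: jumps at 0, so it has no useful derivative there.
    return 1.0 if z >= 0 else 0.0

def logistic(z):
    # Standard logistic sigmoid: smooth, with the handy derivative
    # logistic(z) * (1 - logistic(z)) at every point.
    return 1.0 / (1.0 + math.exp(-z))
```

Because the logistic function is differentiable everywhere, gradient-based calculus techniques apply directly, which is exactly the point being made above.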
And that's really the only thing to remember.
You go from regression techniques
to classification techniques via a threshold.
And if you invest a little bit more
into a threshold that is sigmoidal,
then you're going to get further with your techniques.
Apart from that, everything is like in the regression case.
The math stays exactly the same.
We get the same update rules here.
And the thing to remember is that if we
have the hard threshold, we call the whole thing a perceptron,
which is what we're going to see again.
Or, if we take, say, the logistic function instead,
we get logistic regression, with much better-behaved convergence.
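The shared update rule mentioned above could be sketched like this (a minimal sketch, not the lecture's code): the same form w_i ← w_i + α (y − h(w · x)) x_i covers both variants, with h the chosen threshold function (for the logistic case this is the gradient step under log loss).

```python
import math

def update(w, x, y, alpha, h):
    # One stochastic update step: w_i <- w_i + alpha * (y - h(w . x)) * x_i
    z = sum(wi * xi for wi, xi in zip(w, x))
    err = y - h(z)
    return [wi + alpha * err * xi for wi, xi in zip(w, x)]

step = lambda z: 1.0 if z >= 0 else 0.0          # perceptron (hard threshold)
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))   # logistic (smooth threshold)
```

With the hard threshold, a correctly classified point leaves the weights unchanged; with the sigmoid, every point nudges the weights a little, which is part of why the convergence is smoother.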
Presenters
Accessible via: Open access
Duration: 00:04:55 min
Recording date: 2021-03-30
Uploaded: 2021-03-31 11:17:00
Language: en-US
Recap: Regression and Classification with Linear Models (Part 3)
Main video on the topic in chapter 8 clip 13.